OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference
arxiv.org·15h
🏠 Self-hosted AI
Three Solutions to Nondeterminism in AI
blog.hellas.ai·2d·
Discuss: Hacker News
🏗️ AI Infrastructure
A small number of samples can poison LLMs of any size
dev.to·16h·
Discuss: DEV
🏠 Self-hosted AI
LoRA Explained: Faster, More Efficient Fine-Tuning with Docker
docker.com·1d
🏠 Self-hosted AI
RND1: Simple, Scalable AR-to-Diffusion Conversion
radicalnumerics.ai·23h·
Discuss: Hacker News
🏗️ AI Infrastructure
Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks
machinelearning.apple.com·1d
🎙️ Whisper
SPAD: Specialized Prefill and Decode Hardware for Disaggregated LLM Inference
arxiv.org·15h
🏗️ AI Infrastructure
LLM Optimization Notes: Memory, Compute and Inference Techniques
gaurigupta19.github.io·4d·
Discuss: Hacker News
🏗️ AI Infrastructure
The Hidden Oracle Inside Your AI: Unveiling Data Density with Latent Space Magic by Arvind Sundararajan
dev.to·1d·
Discuss: DEV
📱 Edge AI
Towards a Typology of Strange LLM Chains-of-Thought
lesswrong.com·21h
🏗️ AI Infrastructure
Neuro-Symbolic AI
en.wikipedia.org·4h·
Discuss: Hacker News
🧠 Neuromorphic Chips
Evaluating Gemini 2.5 Deep Think's math capabilities
epoch.ai·5h·
Discuss: Hacker News
🏗️ AI Infrastructure
Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management
arxiv.org·1d
🏠 Self-hosted AI
Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning
arxiviq.substack.com·1d·
Discuss: Substack
🏗️ AI Infrastructure
Operationalizing Data Minimization for Privacy-Preserving LLM Prompting
arxiv.org·3d
🏠 Self-hosted AI
Show HN: Nanowakeword – Automates custom wake word model training
github.com·7h·
Discuss: Hacker News
🎙️ Whisper
Real-Time Adaptive Sparsity Optimization for Edge-Deployed AI Inference Accelerators
dev.to·9h·
Discuss: DEV
🏗️ AI Infrastructure
The key to conversational speech recognition
datasciencecentral.com·1d
🎤 Voice Interfaces
Expanding the Action Space of LLMs to Reason Beyond Language
arxiv.org·15h
🏗️ AI Infrastructure
[D] Anyone using smaller, specialized models instead of massive LLMs?
reddit.com·1d·
🌐 Distributed systems